Conversation

mgrazianoc
Contributor

@mgrazianoc mgrazianoc commented Jun 16, 2025

Rationale within the changes

This PR refactors and extends support for nested types in the Arrow integration. The current implementation of ArrowNestedType is tailored primarily for data structs, as seen in StructBufferBuilder. However, it lacks broader support and certain expected functionalities, such as loadStructArrayBuilder.

To address this, the following improvements have been made:

  • Renamed ArrowNestedType to ArrowTypeStruct to align with naming conventions used elsewhere in the codebase.
  • Introduced initial support for ArrowTypeList, including nested lists.

For simplicity, instead of introducing a dedicated subtype for lists, this PR uses an interface of [Any?]?. If this approach proves insufficient, there are more explicit alternatives that can be explored.
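To make the `[Any?]?` interface concrete, here is a minimal, standalone sketch of how a builder that accepts `[Any?]?` could map onto Arrow's list layout (an offsets buffer with one more entry than there are slots, a validity flag per slot, and a flattened child array). `SketchListBuilder` and its members are hypothetical names for illustration only. They are not the `ListArrayBuilder` added in this PR, and the real builder stores typed child arrays rather than `[Any?]`.

```swift
// Hypothetical illustration of the list layout; not the PR's ListArrayBuilder.
struct SketchListBuilder {
    private(set) var offsets: [Int32] = [0]  // n + 1 entries for n list slots
    private(set) var validity: [Bool] = []   // false marks a null list
    private(set) var values: [Any?] = []     // flattened child values

    mutating func append(_ list: [Any?]?) {
        if let list = list {
            values.append(contentsOf: list)
            validity.append(true)
        } else {
            validity.append(false)
        }
        // A null or empty list repeats the previous offset, so its length is zero.
        offsets.append(Int32(values.count))
    }
}

var builder = SketchListBuilder()
builder.append([1, 2, nil])  // list with a null element
builder.append(nil)          // null list
builder.append([])           // empty, non-null list
builder.append([4])

print(builder.offsets)   // [0, 3, 3, 3, 4]
print(builder.validity)  // [true, false, true, true]
```

The outer optional distinguishes a null list from an empty one, while the inner `Any?` allows null elements, which is why a single `[Any?]?` parameter can cover both cases.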

NOTE: Work on ArrowCExporter and ArrowCImporter has been intentionally deferred. These components require a deeper understanding of memory ownership and child parsing, and I believe they are better addressed in a future PR, unless it's strictly necessary to include them here.

What's Changed

  1. Renamed ArrowNestedType -> ArrowTypeStruct.
  2. Added support for ArrowTypeList, including nested lists.
  3. Implemented ListArray with basic .asString formatting.
  4. Added ListArrayBuilder.
  5. Extended ArrowArrayBuilder to support the .list type.
  6. Implemented loadStructArrayBuilder and loadListArrayBuilder.
  7. Introduced ListBufferBuilder.
  8. Added ArrowReader.loadListData.
  9. Added makeListHolder.
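As a rough illustration of the read side (items 2 and 3), the sketch below slices one list slot out of an offsets buffer and renders it as text, recursing into nested lists. `listAt` and `describe` are hypothetical helpers written for this example. They are not the PR's `ListArray` subscript or `.asString`, whose exact output format may differ.

```swift
// Hypothetical helpers; the PR's ListArray subscript and asString may differ.

// Returns the list stored in slot `index`, or nil when the slot is null.
func listAt(_ index: Int, offsets: [Int32], validity: [Bool], values: [Any?]) -> [Any?]? {
    guard validity[index] else { return nil }
    let start = Int(offsets[index])
    let end = Int(offsets[index + 1])
    return Array(values[start..<end])
}

// Renders a list as text, recursing into elements that are themselves lists.
func describe(_ list: [Any?]?) -> String {
    guard let list = list else { return "null" }
    let parts = list.map { element -> String in
        guard let element = element else { return "null" }
        if let nested = element as? [Any?] { return describe(nested) }
        return "\(element)"
    }
    return "[" + parts.joined(separator: ", ") + "]"
}

// A list-of-lists column with three slots: two inner lists, a null, one null element.
let row0: [Any?] = [1, nil]
let row1: [Any?] = [2, 3]
let values: [Any?] = [row0, row1, nil]
let offsets: [Int32] = [0, 2, 2, 3]
let validity = [true, false, true]

let rendered = (0..<3).map {
    describe(listAt($0, offsets: offsets, validity: validity, values: values))
}
print(rendered)  // ["[[1, null], [2, 3]]", "null", "[null]"]
```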

Are these changes tested?

Tests are included in ArrayTests.swift. The changes are also running in internal applications, including an integration with ArrowFlight.

Closes #16.

@mgrazianoc mgrazianoc force-pushed the mgraziano/list-support branch from 28f225d to 0253f70 June 17, 2025 17:35
@kou kou requested a review from Copilot June 18, 2025 08:12

@Copilot Copilot AI left a comment

Pull Request Overview

This PR extends the Arrow integration to support List data types while refactoring nested type handling. Key changes include renaming ArrowNestedType to ArrowTypeStruct, adding a new ArrowTypeList and corresponding builders/arrays, and updating relevant tests to cover both primitive and nested lists.

Reviewed Changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| Arrow/Tests/ArrowTests/TableTests.swift | Updated closure signature and return type for struct builders |
| Arrow/Tests/ArrowTests/IPCTests.swift | Updated schema construction to use the new ArrowTypeStruct |
| Arrow/Tests/ArrowTests/ArrayTests.swift | Added tests for ListArray functionality with both primitive and nested lists |
| Arrow/Sources/Arrow/ProtoUtil.swift | Refactored type creation for struct and added support for list fields |
| Arrow/Sources/Arrow/ArrowWriter.swift | Updated list handling by replacing ArrowNestedType with ArrowTypeStruct |
| Arrow/Sources/Arrow/ArrowType.swift | Renamed ArrowNestedType → ArrowTypeStruct and added ArrowTypeList |
| Arrow/Sources/Arrow/ArrowReaderHelper.swift & ArrowReader.swift | Added list data loading functions |
| Arrow/Sources/Arrow/ArrowBufferBuilder.swift | Replaced nested type usage and added ListBufferBuilder |
| Arrow/Sources/Arrow/ArrowArrayBuilder.swift | Introduced ListArrayBuilder and updated builder resolution |
| Arrow/Sources/Arrow/ArrowArray.swift | Added ListArray implementation with subscript and asString |

@kou kou changed the title GH-16: Add support for List data types feat: Add support for List data types Jun 18, 2025
@kou
Member

kou commented Jun 18, 2025

@abandy You may want to review this.

@abandy
Contributor

abandy commented Jun 18, 2025

> @abandy You may want to review this.

Will do, thanks!

@mgrazianoc mgrazianoc marked this pull request as draft August 14, 2025 15:13
@mgrazianoc mgrazianoc force-pushed the mgraziano/list-support branch 2 times, most recently from 0f11175 to c4f7ef0 August 14, 2025 16:07
@mgrazianoc mgrazianoc marked this pull request as ready for review August 14, 2025 17:35
@mgrazianoc
Contributor Author

@abandy and @kou, sorry for the delay, priorities moved to other places. I think I've managed to resolve the issues, considering what could be more appropriate for the implementation details and interface of the code.

Any questions, please let me know.

@abandy
Contributor

abandy commented Aug 17, 2025

> @abandy and @kou, sorry for the delay, priorities moved to other places. I think I've managed to resolve the issues, considering what could be more appropriate for the implementation details and interface of the code.
>
> Any questions, please let me know.

@mgrazianoc On initial review, it LGTM! I will try to find some time later this week to go over it again.

@mgrazianoc
Contributor Author

Hi @abandy, do you think we could have this merged in the next few weeks? If there is anything I can help with, please let me know.

Successfully merging this pull request may close these issues.

[Swift] Support list (Array) type